AITopics | chaudhuri and dasgupta

Collaborating Authors

chaudhuri and dasgupta

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Near-Exponential Convergence Rates for kNN Classification based on Boltzmann Margin

Yang, Luyuan, Shafaei, Shayan, Lan, Chao

arXiv.org Machine LearningJun-10-2026

Convergence-rate analysis for classifiers is often conducted under either Tsybakov margin or Massart margin. The former is a relatively weak condition that typically yields polynomial rates, while the latter is substantially stronger but can guarantee exponential rates. In this paper, we introduce a new condition, called Boltzmann margin, that bridges the gap between these two regimes. It is weaker than Massart margin, generally stronger than Tsybakov margin, and can imply many of their properties under suitable conditions. We apply Boltzmann Figure 1: Example data densities on [0,1] that satisfy different margins respectively. Bayes decision boundary is 0.5.margin to the analysis of kNN classifiers and establish the first near-exponential convergence rates for kNN classification. We also present extensions of the main results and provide numerical evidencenecessarily strong for many problems. Can there be a more supporting the main theoretical implications.

artificial intelligence, boltzmann margin, machine learning, (19 more...)

arXiv.org Machine Learning

2606.10361

Country: North America > United States > Oklahoma (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback

Cluster Trees on Manifolds

Neural Information Processing SystemsMar-13-2024, 20:18:23 GMT

In this paper, we study the problem of estimating the cluster tree of a density when the density is supported on or near a manifold.

algorithm, chaudhuri and dasgupta, cluster tree, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
North America > United States > New York > New York County > New York City (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Statistical Optimality of Interpolated Nearest Neighbor Algorithms

Xing, Yue, Song, Qifan, Cheng, Guang

arXiv.org Machine LearningNov-30-2018

In the era of deep learning, understanding over-fitting phenomenon becomes increasingly important. It is observed that carefully designed deep neural networks achieve small testing error even when the training error is close to zero. One possible explanation is that for many modern machine learning algorithms, over-fitting can greatly reduce the estimation bias, while not increasing the estimation variance too much. To illustrate the above idea, we prove that the proposed interpolated nearest neighbor algorithm achieves the minimax optimal rate in both regression and classification regimes, and observe that they are empirically better than the traditional $k$ nearest neighbor method in some cases.

artificial intelligence, assumption, machine learning, (15 more...)

arXiv.org Machine Learning

1810.02814

Country: North America > United States > Indiana > Tippecanoe County (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)

Add feedback

Minimax rates for cost-sensitive learning on manifolds with approximate nearest neighbours

Reeve, Henry WJ, Brown, Gavin

arXiv.org Machine LearningMar-1-2018

We study the approximate nearest neighbour method for cost-sensitive classification on low-dimensional manifolds embedded within a high-dimensional feature space. We determine the minimax learning rates for distributions on a smooth manifold, in a cost-sensitive setting. This generalises a classic result of Audibert and Tsybakov. Building upon recent work of Chaudhuri and Dasgupta we prove that these minimax rates are attained by the approximate nearest neighbour algorithm, where neighbours are computed in a randomly projected low-dimensional space. In addition, we give a bound on the number of dimensions required for the projection which depends solely upon the reach and dimension of the manifold, combined with the regularity of the marginal.

artificial intelligence, machine learning, nearest neighbour, (15 more...)

arXiv.org Machine Learning

1803.0031

Country: Europe > United Kingdom (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.83)

Add feedback

Beyond Hartigan Consistency: Merge Distortion Metric for Hierarchical Clustering

Eldridge, Justin, Belkin, Mikhail, Wang, Yusu

arXiv.org Machine LearningJul-13-2015

Hierarchical clustering is a popular method for analyzing data which associates a tree to a dataset. Hartigan consistency has been used extensively as a framework to analyze such clustering algorithms from a statistical point of view. Still, as we show in the paper, a tree which is Hartigan consistent with a given density can look very different than the correct limit tree. Specifically, Hartigan consistency permits two types of undesirable configurations which we term over-segmentation and improper nesting. Moreover, Hartigan consistency is a limit property and does not directly quantify difference between trees. In this paper we identify two limit properties, separation and minimality, which address both over-segmentation and improper nesting and together imply (but are not implied by) Hartigan consistency. We proceed to introduce a merge distortion metric between hierarchical clusterings and show that convergence in our distance implies both separation and minimality. We also prove that uniform separation and minimality imply convergence in the merge distortion metric. Furthermore, we show that our merge distortion metric is stable under perturbations of the density. Finally, we demonstrate applicability of these concepts by proving convergence results for two clustering algorithms. First, we show convergence (and hence separation and minimality) of the recent robust single linkage algorithm of Chaudhuri and Dasgupta (2010). Second, we provide convergence results on manifolds for topological split tree clustering.

artificial intelligence, cluster tree, machine learning, (16 more...)

arXiv.org Machine Learning

1506.06422

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

Cluster Trees on Manifolds

Balakrishnan, Sivaraman, Narayanan, Srivatsan, Rinaldo, Alessandro, Singh, Aarti, Wasserman, Larry

Neural Information Processing SystemsDec-31-2013

We investigate the problem of estimating the cluster tree for a density $f$ supported on or near a smooth $d$-dimensional manifold $M$ isometrically embedded in $\mathbb{R}^D$. We study a $k$-nearest neighbor based algorithm recently proposed by Chaudhuri and Dasgupta. Under mild assumptions on $f$ and $M$, we obtain rates of convergence that depend on $d$ only but not on the ambient dimension $D$. We also provide a sample complexity lower bound for a natural class of clustering algorithms that use $D$-dimensional neighborhoods.

algorithm, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.49)

Add feedback